Optimize sys.indexes index_id computation for correlated subquery access #4849

Open
RuchaSK1 wants to merge 4 commits into babelfish-for-postgresql:BABEL_6_X_DEV from amazon-aurora:jira-babel-6458-part2

RuchaSK1 commented Jun 3, 2026

Contributor

Description

sys.indexes computes index_id using a materialized CTE with row_number(). The MATERIALIZED keyword forces PostgreSQL to evaluate all indexes in the database upfront, regardless of how many are actually needed by the caller.

This is efficient for bulk access (SELECT * FROM sys.indexes) but becomes a bottleneck when sys.indexes is accessed per row in correlated subqueries, a pattern that appears in the SSMS Expanding Tables query.

In such cases, the full CTE is scanned repeatedly (once per outer row), so execution time grows linearly with both the number of tables and the total number of indexes in the database.
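
To make the scaling argument concrete, here is a rough back-of-the-envelope model in Python. It only counts rows visited under two simplifying assumptions that are not taken from the patch: every outer row rescans the entire materialized result, and there is exactly one outer row per table. The function names are hypothetical, and the model ignores planner choices, caching, and per-row CPU cost, so it is not a prediction of the measured timings.

```python
def cte_rows_scanned(outer_rows, total_indexes):
    """Rows visited if every outer row rescans the whole materialized result."""
    return outer_rows * total_indexes


def lookup_rows_read(indexes_per_outer_row):
    """Rows visited if each outer row reads only its own table's indexes."""
    return sum(indexes_per_outer_row)


# Toy catalog (made-up numbers): three tables holding 2, 1, and 3 indexes.
toy = [2, 1, 3]
toy_cte = cte_rows_scanned(len(toy), sum(toy))
toy_lookup = lookup_rows_read(toy)

# Same model at the scale quoted under Performance Testing
# (7,500 tables, 15,751 indexes), assuming one outer row per table.
big_cte = cte_rows_scanned(7_500, 15_751)

print(toy_cte, toy_lookup, big_cte)
```

Under these assumptions the repeated scan grows with the product of the two counts, while the per-table read grows only with the number of indexes actually touched.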

Fix

We replace the materialized CTE with an inline scalar subquery that computes index_id on demand by counting sibling indexes via pg_index_indrelid_index:

  • Clustered index → index_id = 1
  • Non-clustered → 2 + count of earlier non-clustered indexes on the same table

For per-row access, this reads only the sibling indexes on the same table (typically 1-2).
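
The numbering rule above can be sketched in Python as a small model. This is an illustration, not the SQL in the patch: the `Index` record, the helper names, and the choice to order "earlier" indexes by index OID are all assumptions made for the example. The dictionary keyed by table OID stands in for the lookup that pg_index_indrelid_index provides.

```python
from collections import defaultdict
from typing import NamedTuple


class Index(NamedTuple):
    oid: int            # hypothetical stand-in for the index OID
    table_oid: int      # table the index belongs to
    is_clustered: bool


def build_sibling_lookup(indexes):
    """Group indexes by table, mimicking a lookup keyed on the table OID."""
    by_table = defaultdict(list)
    for ix in indexes:
        by_table[ix.table_oid].append(ix)
    return by_table


def index_id(target, by_table):
    """Clustered -> 1; otherwise 2 + number of earlier non-clustered siblings."""
    if target.is_clustered:
        return 1
    siblings = by_table[target.table_oid]  # only this table's indexes are read
    earlier = sum(
        1 for s in siblings
        if not s.is_clustered and s.oid < target.oid  # assumed ordering: by OID
    )
    return 2 + earlier


indexes = [
    Index(oid=101, table_oid=1, is_clustered=True),
    Index(oid=102, table_oid=1, is_clustered=False),
    Index(oid=103, table_oid=1, is_clustered=False),
    Index(oid=201, table_oid=2, is_clustered=False),
]
lookup = build_sibling_lookup(indexes)
ids = {ix.oid: index_id(ix, lookup) for ix in indexes}
print(ids)
```

Each call touches only the entries for one table, which is the property the fix relies on for per-row access.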

Performance Testing

Tested on a 17.10 instance with ~37K objects (7,500 tables, 15,751 indexes).

| Query | Baseline | With Fix | Change |
|---|---|---|---|
| SSMS Expanding Tables (with OID→INT4 fix) | 35s | 13s | 2.7× faster |
| SELECT t.name, i.name FROM sys.tables t JOIN sys.indexes i ON t.object_id = i.object_id AND i.index_id = 1 WHERE t.is_ms_shipped = 0 | 39.5s | 84ms | 470× faster |
| SELECT t.name, i.name, i.index_id FROM sys.tables t JOIN sys.indexes i ON t.object_id = i.object_id WHERE t.is_ms_shipped = 0 (all indexes per table) | 328ms | 196ms | 1.7× faster |
| SELECT * FROM sys.indexes (bulk, all 15751 indexes) | 188ms | 190ms | No regression |
| Point lookup on 50-index table | 27ms | 3ms | No regression |

Issues Resolved

Task: BABEL-6816

Authored-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>

Test Scenarios Covered

  • Use case based -

  • Boundary conditions -

  • Arbitrary inputs -

  • Negative test cases -

  • Minor version upgrade tests -

  • Major version upgrade tests -

  • Performance tests - Yes

  • Tooling impact -

  • Client tests -

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Rucha Kulkarni added 4 commits June 3, 2026 19:13
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>